Abstract

Reed-Muller codes encode an $m$-variate polynomial of degree $r$ by evaluating it on all points
in $\{0,1\}^m$. We denote this code by $RM(m, r)$. The minimal distance of $RM(m, r)$ is $2^{m-r}$ and so
it cannot correct more than half that number of errors in the worst case. For random errors one
may hope for a better result.

In this work we give an efficient algorithm (in the block length $n = 2^m$) for decoding random
errors in Reed-Muller codes far beyond the minimal distance. Specifically, for low rate
codes (of degree $r = o(\sqrt{m})$) we can correct a random set of $(1/2 - o(1))n$ errors with high
probability. For high rate codes (of degree $m - r$ for $r = o(\sqrt{m/\log m})$), we can correct roughly
$m^{r/2}$ errors.

More generally, for any integer $r$, our algorithm can correct any error pattern in
$RM(m, m - (2r + 2))$ for which the same erasure pattern can be corrected in $RM(m, m - (r + 1))$. The
results above are obtained by applying recent results of Abbe, Shpilka and Wigderson (STOC,
2015), Kumar and Pfister (2015) and Kudekar et al. (2015) regarding the ability of Reed-Muller
codes to correct random erasures.

The algorithm is based on solving a carefully defined set of linear equations and thus it is
significantly different from other algorithms for decoding Reed-Muller codes that are based on
the recursive structure of the code. It can be seen as a more explicit proof of a result of Abbe et
al. that shows a reduction from correcting erasures to correcting errors, and it also bears some
similarities with the famous Berlekamp-Welch algorithm for decoding Reed-Solomon codes.

1 Introduction


Consider the following challenge: 

Given the truth table of a polynomial $f(x) \in \mathbb{F}_2[x_1, \ldots, x_m]$ of degree at most $r$, in
which a $1/2 - o(1)$ fraction of the locations were flipped (that is, given the evaluations
of $f$ over $\mathbb{F}_2^m$ with nearly half the entries corrupted), recover $f$ efficiently.

If the errors are adversarial, then clearly this task is impossible for any degree bound $r \ge 2$, since
there are two different quadratic polynomials that disagree on only a $1/4$ fraction of the domain.
Hence, we turn to considering random sets of errors of size $(1/2 - o(1))2^m$, and we hope to recover
$f$ with high probability. (In this case, one may also consider the setting where each bit is
independently flipped with probability $1/2 - o(1)$. By standard Chernoff bounds, both settings are almost
equivalent.)
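The adversarial obstruction above can be checked on a small case. The following Python sketch (the helper name `disagreement_fraction` is ours and purely illustrative) confirms that the quadratic $x_1 x_2$ and the zero polynomial, both of degree at most 2, disagree on exactly a quarter of $\mathbb{F}_2^m$:

```python
from itertools import product

def disagreement_fraction(f, g, m):
    """Fraction of points of F_2^m on which the functions f and g differ."""
    points = list(product((0, 1), repeat=m))
    return sum(f(x) != g(x) for x in points) / len(points)

# x_1 * x_2 versus the zero polynomial on m = 5 variables.
frac = disagreement_fraction(lambda x: x[0] & x[1], lambda x: 0, 5)
```

An adversary who flips half of those disagreement positions makes the two polynomials equally plausible, which is why unique recovery from adversarial errors is limited to roughly a $1/8$ fraction for quadratics.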

Even in the random model, if every bit is flipped with probability exactly $1/2$, the situation is
again hopeless: in this case the input is completely random and carries no information whatsoever
about the original polynomial.

It turns out, however, that even a very small relaxation leads to a dramatic improvement in
our ability to recover the hidden polynomial: in this paper we prove, among other results, that
even at corruption rate $1/2 - o(1)$ and degree bound as large as $r = o(\sqrt{m})$, we can efficiently recover
the unique polynomial $f$ whose evaluations were corrupted. Note that in the worst case, given a
polynomial of such a high degree, an adversary can flip a tiny fraction of the bits (just slightly
more than $1/2^{\sqrt{m}}$) and prevent unique recovery of $f$, even if we do not require an efficient
solution; and yet, in the average case, we can deal with flipping almost half the bits.

Recasting the playful scenario above in a more traditional terminology, this paper deals with
similar questions related to recovery of low-degree multivariate polynomials from their randomly
corrupted evaluations on $\mathbb{F}_2^m$, or in the language of coding theory, we study the problem of
decoding Reed-Muller codes under random errors in the binary symmetric channel (BSC). We turn to some
background and motivation.

1.1 Reed-Muller Codes 

Reed-Muller (RM) codes were introduced in 1954, first by Muller [Mul54] and shortly after by
Reed [Ree54], who also provided a decoding algorithm. They are among the oldest and simplest
codes to construct: the codewords are multivariate polynomials of a given degree, and the
encoding function is just their evaluation vectors. In this work we mainly focus on the most basic
case where the underlying field is $\mathbb{F} = \mathbb{F}_2$, the field of two elements, although our techniques do
generalize to larger finite fields. Over $\mathbb{F}_2$, the Reed-Muller code of degree $r$ in $m$ variables, denoted
by $RM(m, r)$, has block length $n = 2^m$, rate $\binom{m}{\le r}/2^m$ (where $\binom{m}{\le r} = \sum_{i=0}^{r} \binom{m}{i}$) and minimal distance $2^{m-r}$.
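To make these parameters concrete, here is a minimal Python sketch that computes the block length, dimension, rate and distance of $RM(m, r)$ and encodes a polynomial by evaluating it on all of $\mathbb{F}_2^m$. The function names and the dictionary representation of a polynomial (monomial as a tuple of variable indices, mapped to its coefficient) are illustrative assumptions, not notation from the paper.

```python
from itertools import product
from math import comb

def rm_parameters(m, r):
    """Return (block length, dimension, rate, minimal distance) of RM(m, r)."""
    n = 2 ** m
    k = sum(comb(m, i) for i in range(r + 1))
    return n, k, k / n, 2 ** (m - r)

def rm_encode(coeffs, m):
    """Evaluate a multilinear polynomial over F_2 on every point of F_2^m.

    coeffs maps a monomial, given as a tuple of variable indices, to 0 or 1.
    """
    word = []
    for u in product((0, 1), repeat=m):
        value = 0
        for monomial, c in coeffs.items():
            if c and all(u[i] for i in monomial):
                value ^= 1
        word.append(value)
    return word

n, k, rate, d = rm_parameters(4, 2)
# The degree-2 polynomial x_1 x_2 + x_3 (0-based indices below) in RM(4, 2).
codeword = rm_encode({(0, 1): 1, (2,): 1}, 4)
```

Every nonzero codeword produced this way has Hamming weight at least $d = 2^{m-r}$.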

RM codes have been extensively studied with respect to decoding errors in both the worst case 
and random setting. We begin by giving a review of Reed-Muller codes and their use in theoretical 
computer science and then discuss our results. 

Background 

Error-correcting codes (over both large and small finite fields) have been extremely influential
in the theory of computation, playing a central role in some important developments in several
areas such as cryptography (e.g. [Sha79] and [BF90]), theory of pseudorandomness (e.g. [BV10]),
probabilistic proof systems (e.g. [BFL91, Sha92] and [ALM+98]) and many more.

An important aspect of error correcting codes that has received a lot of attention is designing
efficient decoding algorithms. The objective is to come up with an algorithm that can correct a certain
number of errors in a received word. There are two settings in which this problem is studied:

Worst case errors: This is also referred to as errors in the Hamming model [Ham50]. Here, the
algorithm should recover the original message regardless of the error pattern, as long as there
are not too many errors. The number of errors such a decoding algorithm can tolerate is upper
bounded in terms of the distance of the code. The distance of the code $C$ is the minimum Hamming
distance between any two distinct codewords in $C$. If the distance is $d$, then one can uniquely recover from at most
$d - 1$ erasures and from $\lfloor (d - 1)/2 \rfloor$ errors. For this model of worst-case errors it is easy to prove
that Reed-Muller codes perform badly. They have relatively small distance compared to what
random codes of the same rate can achieve (and also compared to explicit families of codes).

Another line of work in Hamming's worst case setting concerns designing algorithms that can
correct beyond the unique-decoding bound. Here there is no unique answer and so the algorithm
returns a list of candidate codewords. In this case the number of errors that the algorithm can
tolerate is a parameter of the distance of the code. This question received a lot of attention and among
the works in this area we mention the seminal works of Goldreich and Levin on Hadamard codes
[GL89] and of Sudan [Sud97] and Guruswami and Sudan [GS99] on list decoding Reed-Solomon
codes. Recently, the list-decoding question for Reed-Muller codes was studied by Gopalan, Klivans
and Zuckerman [GKZ08] and by Bhowmick and Lovett [BL15], who proved that the list
decoding radius^1 of Reed-Muller codes, over $\mathbb{F}_2$, is at least twice the minimum distance (recall
that the unique decoding radius is half that quantity) and is smaller than four times the minimal
distance, when the degree of the code is constant.

Random errors: A different setting in which decoding algorithms are studied is Shannon's
model of random errors [Sha48]. In Shannon's average-case setting (which we study here), a
codeword is subjected to a random corruption, from which recovery should be possible with high
probability. This random corruption model is called a channel. The two most basic ones, the Binary
Erasure Channel (BEC) and the Binary Symmetric Channel (BSC), have a parameter $p$ (which may
depend on $n$), and corrupt a message by independently replacing, with probability $p$, the symbol
in each coordinate, with a "lost" symbol in the BEC($p$) channel, and with the complementary
symbol in the BSC($p$) case. In his paper Shannon studied the optimal trade-off achievable for
these channels (and many other channels) between the distance and rate. For every $p$, the capacity
of BEC($p$) is $1 - p$, and the capacity of BSC($p$) is $1 - h(p)$, where $h$ is the binary entropy function.^2
Shannon also proved that random codes achieve this optimal behavior. That is, for every $\epsilon > 0$
there exist codes of rate $1 - h(p) - \epsilon$ for the BSC (and rate $1 - p - \epsilon$ for the BEC), that can decode
from a fraction $p$ of errors (erasures) with high probability.
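For a quick numerical feel for these two capacities, the sketch below evaluates $1 - p$ and $1 - h(p)$ in Python. The function names are illustrative; the formulas are exactly the ones stated above, with $h$ the binary entropy function.

```python
from math import log2

def binary_entropy(p):
    """h(p) = -p log2(p) - (1 - p) log2(1 - p), with h(0) = h(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def bec_capacity(p):
    """Capacity of the binary erasure channel with erasure probability p."""
    return 1 - p

def bsc_capacity(p):
    """Capacity of the binary symmetric channel with flip probability p."""
    return 1 - binary_entropy(p)
```

For instance, `bsc_capacity(0.5)` is zero, matching the earlier remark that flipping every bit with probability exactly $1/2$ destroys all information, while `bsc_capacity(0.11)` is close to one half.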

For our purposes, it is more convenient to assume that the codeword is subjected to a fixed
number $s$ of random errors. Note that by the Chernoff-Hoeffding bound (see e.g., [AS92]), the
probability that more than $pn + \omega(\sqrt{pn})$ errors occur in BSC($p$) (or BEC($p$)) is $o(1)$, and so we
can restrict ourselves to the case of a fixed number $s$ of random errors, by setting the corruption
probability to be $p = s/n$. We refer to [ASW15] for further discussion on this subject.

^1 The maximum distance $\eta$ for which the number of codewords within distance $\eta$ is only polynomially large (in $n$).
^2 $h(p) = -p \log_2(p) - (1 - p) \log_2(1 - p)$, for $p \in (0,1)$, and $h(0) = h(1) = 0$.

Decoding erasures to decoding errors

Recently, there has been considerable progress in our understanding of the behavior of
Reed-Muller codes under random erasures. In [ASW15], Abbe, Shpilka and Wigderson showed that
Reed-Muller codes achieve capacity for the BEC for both sufficiently low and sufficiently high
rates. Specifically, they showed that $RM(m, r)$ achieves capacity for the BEC for $r = o(m)$ or
$r > m - o(\sqrt{m/\log m})$. More recently, Kumar and Pfister [KP15] and Kudekar, Mondelli, Şaşoğlu
and Urbanke [KMŞU15] independently showed that Reed-Muller codes achieve capacity for the
BEC in the entire constant rate regime, that is, $r \in [m/2 - O(\sqrt{m}), m/2 + O(\sqrt{m})]$. These regimes
are pictorially represented in Figure 1.



Figure 1: Regime of $r$ for which $RM(m, r)$ is known to achieve capacity for the BEC

Another result proved by Abbe et al. [ASW15] is that Reed-Muller codes $RM(m, m - 2r - 2)$
can correct any error pattern if the same erasure pattern can be decoded in $RM(m, m - r - 1)$. This
reduction is appealing on its own, since it connects decoding from erasures (which is easier in
both an intuitive and an algorithmic manner) with decoding from errors; but its importance is
further emphasized by the progress made later by Kumar and Pfister and Kudekar et al., who
showed that Reed-Muller codes can correct many erasures in the constant rate regime, right up to
the channel capacity.

This result shows that $RM(m, m - (2r + 2))$ can cope with most error patterns of weight
$(1 - o(1))\binom{m}{\le r}$, which is the capacity of $RM(m, m - (r + 1))$ for the BEC. While this is polynomially
smaller than what can be achieved in the Shannon model of errors for random codes of the same
rate, this number is still much larger (super-polynomially) than the distance (and the list-decoding
radius) of the code, which is $2^{2r+2}$. Also, since $RM(m, m/2 - o(\sqrt{m}))$ can cope with a $(1/2 - o(1))$-fraction
of erasures, this translation implies that $RM(m, o(\sqrt{m}))$ can handle that many random
errors.

However, a shortcoming of the proof of Abbe et al. for the BSC is that it is existential. In
particular it does not provide an efficient decoding algorithm. Thus, Abbe et al. left open the
question of coming up with a decoding algorithm for Reed-Muller codes from random errors.

1.2 Our contributions 

In this work we give an efficient decoding algorithm for Reed-Muller codes that matches the
parameters given by Abbe et al. Following the aforementioned results about the erasure correcting
ability of Reed-Muller codes, the results can be partitioned into the low-rate and the high-rate
regimes. We begin with the result for the low rate case.

Theorem 1 (Low rate, informal). Let $r < \delta\sqrt{m}$ for a small enough $\delta$. Then, there is an efficient algorithm
that can decode $RM(m, r)$ from a random set of $(1 - o(1)) \cdot \binom{m}{\le (m-r)/2 - 1}$ errors. In particular, if $r = o(\sqrt{m})$,
the algorithm can decode from $(1/2 - o(1)) \cdot 2^m$ errors. The running time of the algorithm is $O(n^4)$ and it
can be simulated in NC.

For high rate Reed-Muller codes, we cannot hope to achieve such a high error correction
capability as in the low rate case, even information theoretically. We do give, however, an algorithm
that corrects many more errors (a super-polynomially larger number) than what the minimal
distance of the code suggests, and its running time is also nearly linear in the block length of the
code.

Theorem 2 (High rate, informal). Let $r = o(\sqrt{m/\log m})$. Then, there is an efficient algorithm that can
decode $RM(m, m - (2r + 2))$ from a random set of $(1 - o(1))\binom{m}{\le r}$ errors. Moreover, the running time of
the algorithm is $2^m \cdot \mathrm{poly}(\binom{m}{\le r})$ and it can be simulated in NC.

Recall that the block length of the code is $n = 2^m$, and thus the running time is near linear in $n$
when $r = o(m)$.

A general property of our algorithm is that it corrects any error pattern in $RM(m, m - 2r - 2)$
for which the same erasure pattern in $RM(m, m - r - 1)$ can be corrected. Stated differently, if an
erasure pattern can be corrected in $RM(m, m - r - 1)$ then the same pattern, where the "lost"
symbol is replaced with arbitrary 0/1 values, can be corrected in $RM(m, m - (2r + 2))$. This property
is useful when we know $RM(m, m - r - 1)$ can correct a large set of erasures with high
probability, that is, when $m - r - 1$ falls in the red region in Figure 1. Thus, our result has implications also
beyond the above two instances. In particular, it may be the case that our algorithm performs well
for other rates as well. For example, consider the following question and the theorem it implies.

Question 3. Does $RM(m, m - r - 1)$ achieve capacity for the BEC?

Theorem 4 (informal). For any value $r$ for which the answer to Question 3 is positive, there exists an
efficient algorithm that decodes $RM(m, m - 2r - 2)$ from a random set of $(1 - o(1))\binom{m}{\le r}$ errors with
probability $1 - o(1)$ (over the random errors). Moreover, the running time of the algorithm is
$2^m \cdot \mathrm{poly}(\binom{m}{\le r})$.

Recall that Abbe et al. [ASW15] also proved that the answer to Question 3 is positive for
$r = m - o(m)$ (that is, for $RM(m, o(m))$) but this case does not help us as we need to consider
$RM(m, m - (2r + 2))$ and $m - (2r + 2) < 0$ in this case. The coding theory community seems
to believe the answer to Question 3 is positive, for all values of $r$, and conjectures to that effect
were made^3 in [CF07, Ari08, MHU14]. Recent simulations have also suggested that the answer
to the question is positive [Ari08, MHU14]. Thus, it seems natural to believe that the answer
is positive for most values of $r$, even for $r = \Theta(m)$. As a conclusion, the belief in the coding
theory community suggests that our algorithm can decode a random set of roughly $\binom{m}{\le r}$ errors
in $RM(m, m - (2r + 2))$. For example, for $r = p \cdot m$, where $p < 1/2$, the minimal distance of
$RM(m, m - (2r + 2))$ is roughly $2^{2pm}$ whereas our algorithm can decode from roughly $2^{h(p)m}$
random errors (assuming the answer to Question 3 is positive), which is a much larger quantity for
every $p < 1/2$.
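The gap between the minimal distance $2^{2r+2}$ of $RM(m, m - (2r + 2))$ and the target number of random errors $\binom{m}{\le r}$ is easy to tabulate. The following Python sketch does so for a few illustrative parameter choices; the helper names are ours, and the numbers only restate the two formulas above (they say nothing about whether Question 3 holds for those parameters).

```python
from math import comb

def binom_le(m, r):
    """binom(m, <= r): the number of monomials of degree at most r in m variables."""
    return sum(comb(m, i) for i in range(r + 1))

def distance_vs_random_errors(m, r):
    """For RM(m, m - (2r + 2)): (minimal distance, binom(m, <= r))."""
    return 2 ** (2 * r + 2), binom_le(m, r)

for m, r in [(100, 3), (1000, 5), (1000, 10)]:
    distance, target = distance_vs_random_errors(m, r)
    print(f"m={m} r={r} distance={distance} random-error target={target}")
```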

In Section 3, we also present an abstraction of our decoding procedure that may be applicable
to other linear codes. This is a generalization of the abstract Berlekamp-Welch decoder or
"error-locating pairs" method of Duursma and Kötter [DK94] that connects decodable erasure
patterns on a larger code to decodable error patterns. A specific instantiation of this was observed
by Abbe et al. [ASW15] by connecting decodable error patterns of any linear code $C$ to decodable
erasure patterns of an appropriate "tensor" $C'$ of $C$ (by essentially embedding these codes in a
large enough RM code). Although Abbe et al. did not provide an efficient decoding algorithm,
the algorithm we present directly applies here (Section 3.2). The abstraction of the "error-locating
pairs" method presented in Section 3 should hopefully be applicable in other contexts too,
especially considering the generality of the results of [KP15, KMŞU15].

^3 The belief that RM codes achieve capacity is much older, but we did not trace back where it appears first.

1.3 Related literature 

In Section 1.1 we surveyed the known results regarding the ability of Reed-Muller codes to correct 
random erasures. In this section we summarize the results known about recovering RM codes 
from random errors. 

Once again, it is useful to distinguish between the low rate and the high rate regime of
Reed-Muller codes. We shall use $d$ to denote the distance of the code in context. For $RM(m, r)$ codes,
$d = 2^{m-r}$.

In [Kri70], the majority logic algorithm of [Ree54] is shown to succeed in recovering all but
a vanishing fraction of error patterns of weight up to $d \log d/4$ for all RM codes of positive rate.
In [Dum06], Dumer showed for all $r$ such that $\min(r, m - r) = \omega(\log m)$ that most error patterns
of weight at most $(d \log d/2) \cdot (1 - \frac{\log m}{\log d})$ can be recovered in $RM(m, r)$. To make sense of the
parameters, we note that when $r = m - \omega(\log m)$ the weight is roughly $d \log d/2$. To compare
this result to ours, we first consider the case when $r = m - o(\sqrt{m/\log m})$. Here the algorithm
of [Dum06] can correct roughly $d \log d/2$ random errors in $RM(m, r)$ whereas Theorem 2 gives
an algorithm for correcting roughly $m^{o(\sqrt{m/\log m})} \approx (d \log d)^{O(\log m)}$ random errors.

Further, even for the case $r = (1 - p)m$, where $p < 1/2$ is a constant, the bound in the
above result of [Dum06] is equal to $O(d \log d)$. On the other hand, assuming a positive answer
to Question 3, Theorem 4 implies an efficient decoding algorithm for $RM(m, (1 - p)m)$ that can
decode from, roughly, $\binom{m}{\le pm/2} \approx 2^{h(p/2)m}$ random errors, for this case.

We now turn to other regimes of parameters, specifically RM codes of low rate. For the special
case of $r = 1, 2$, [HKL05] shows that $RM(m, r)$ codes are capacity-achieving. In [SP92], it is shown
that RM codes of fixed order (i.e., $r = O(1)$) can decode most error patterns of weight up to
$\frac{1}{2} n \left(1 - \sqrt{c(2^r - 1)m^r/(n \, r!)}\right)$, where $c > \ln(4)$. In [ASW15], Abbe et al. settled the question for
low order Reed-Muller codes, proving that $RM(m, r)$ codes achieve capacity for the BSC when
$r = o(m)$. We note however that all the results mentioned here are existential in nature
and do not provide an efficient decoding algorithm.

A line of work by Dumer [Dum04, DS06] based on recursive algorithms (that exploit the
recursive structure of Reed-Muller codes) obtains algorithmic results mainly for low-rate regimes. In
[Dum04], it is shown that for a fixed degree, i.e., $r = O(1)$, an algorithm of complexity $O(n \log n)$
can correct most error patterns of weight up to $n(1/2 - \epsilon)$ given that $\epsilon$ exceeds $n^{-1/2^r}$. In [Dum06],
this is improved to errors of weight up to $\frac{1}{2} n \left(1 - (4m/d)^{1/2^r}\right)$ for all $r = o(\log m)$. The case
$r = \omega(\log m)$ is also covered in [Dum06], as described above.

We note that all the efficient algorithms mentioned above (both for high- and low-rate) rely
on the so-called Plotkin construction of the code, that is, on its recursive structure (expanding
an $m$-variate polynomial according to the $m$-th variable,
$f(x_1, \ldots, x_m) = x_m g(x_1, \ldots, x_{m-1}) + h(x_1, \ldots, x_{m-1})$), whereas our approach is very different.

Figure 2: Comparison with [Dum04, DS06, Dum06]

We summarize and compare our results with [Dum04, DS06, Dum06] for various ranges of
parameters in Figure 2 (degree is $r$ and distance is $d = 2^{m-r}$). The dotted region in Figure 2
corresponds to the uncovered region in Figure 1 beyond $m/2$, via the connection given in Theorem 4.

1.4 Notation and terminology 

Before explaining the idea behind the proofs of our results we need to introduce some notation 
and parameters. We shall use the same notation as [ASW15]. 

• We denote by $\mathcal{M}(m, r)$ the set of $m$-variate monomials over $\mathbb{F}_2$ of degree at most $r$.

• For non-negative integers $r \le m$, $RM(m, r)$ denotes the Reed-Muller code whose codewords
are the evaluation vectors of all multivariate polynomials of degree at most $r$ on $m$ boolean
variables. The maximal degree $r$ is sometimes called the order of the code. The block length
of the code is $n = 2^m$, the dimension is $k = k(m, r) = \sum_{i=0}^{r} \binom{m}{i} =: \binom{m}{\le r}$, and the distance is
$d = d(m, r) = 2^{m-r}$. The code rate is given by $R = k(m, r)/n$.

• We use $E(m, r)$ to denote the "evaluation matrix" of parameters $m, r$, whose rows are indexed
by all monomials in $\mathcal{M}(m, r)$, and whose columns are indexed by all vectors in $\mathbb{F}_2^m$. The value
at entry $(M, u)$ is equal to $M(u)$. For $u \in \mathbb{F}_2^m$, we denote by $u^r$ the column of $E(m, r)$ indexed
by $u$, which is a $k$-dimensional vector, consisting of all evaluations of degree $\le r$ monomials
at $u$. For a subset of columns $U \subseteq \mathbb{F}_2^m$ we denote by $U^r$ the corresponding submatrix of
$E(m, r)$.

• $E(m, r)$ is a generator matrix for $RM(m, r)$. The duality property of Reed-Muller codes (see,
for example, [MS77]) states that $E(m, m - r - 1)$ is a parity-check matrix for $RM(m, r)$, or
equivalently, $E(m, r)$ is a parity-check matrix for $RM(m, m - r - 1)$.

• We associate with a subset $U \subseteq \mathbb{F}_2^m$ its characteristic vector $1_U \in \mathbb{F}_2^n$. We often think of the
vector $1_U$ as denoting either an erasure pattern or an error pattern.

• For a positive integer $n$, we use the standard notation $[n]$ for the set $\{1, 2, \ldots, n\}$.
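The evaluation matrix and the duality property in the list above can be checked directly on small parameters. The sketch below builds $E(m, r)$ as a list of rows and verifies that every row of $E(m, m - r - 1)$ is orthogonal over $\mathbb{F}_2$ to every row of $E(m, r)$. The function names and the representation of monomials as tuples of variable indices are assumptions made for illustration.

```python
from itertools import combinations, product

def monomials(m, r):
    """All multilinear monomials of degree <= r, as tuples of variable indices."""
    return [S for k in range(r + 1) for S in combinations(range(m), k)]

def evaluation_matrix(m, r):
    """E(m, r): rows indexed by monomials, columns by points of F_2^m."""
    points = list(product((0, 1), repeat=m))
    return [[int(all(u[i] for i in S)) for u in points] for S in monomials(m, r)]

def is_parity_check(H, G):
    """True iff every row of H has inner product 0 (mod 2) with every row of G."""
    return all(sum(a & b for a, b in zip(h, g)) % 2 == 0 for h in H for g in G)

m, r = 4, 1
G = evaluation_matrix(m, r)            # generator matrix of RM(4, 1)
H = evaluation_matrix(m, m - r - 1)    # claimed parity-check matrix of RM(4, 1)
```

The check succeeds because the product of a row of $G$ and a row of $H$ is a monomial of degree at most $m - 1$, and every such monomial sums to zero over $\mathbb{F}_2^m$.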

We next define what we call the degree-r syndrome of a set. 

Definition 5 (Syndrome). Let $r \le m$ be two positive integers. The degree-$r$ syndrome, or simply
$r$-syndrome, of a set $U = \{u_1, \ldots, u_t\} \subseteq \mathbb{F}_2^m$ is the $\binom{m}{\le r}$-dimensional vector $\alpha$ whose entries are indexed by
all monomials $M \in \mathcal{M}(m, r)$, such that

$$\alpha_M := \sum_{i=1}^{t} M(u_i).$$

Note that this is nothing but the syndrome of the error pattern $1_U \in \mathbb{F}_2^n$ in the code $RM(m, m - r - 1)$
(whose parity check matrix is the generator matrix of $RM(m, r)$).
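As a small worked instance of Definition 5, the following Python sketch computes the $r$-syndrome of a set $U$ by summing monomial evaluations modulo 2. The dictionary representation (monomial as a tuple of variable indices, mapped to $\alpha_M$) is an illustrative choice, not the paper's notation.

```python
from itertools import combinations

def monomials(m, r):
    """All multilinear monomials of degree <= r, as tuples of variable indices."""
    return [S for k in range(r + 1) for S in combinations(range(m), k)]

def syndrome(U, m, r):
    """The r-syndrome of U: alpha_M = sum over u in U of M(u), taken mod 2."""
    return {M: sum(all(u[i] for i in M) for u in U) % 2 for M in monomials(m, r)}

# Two points of F_2^3 and their degree-1 syndrome.
alpha = syndrome([(1, 0, 1), (1, 1, 0)], 3, 1)
```

Here the constant monomial (the empty tuple) records the parity of $|U|$, and each degree-1 monomial records the parity of the number of points whose corresponding coordinate is 1.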

1.5 Proof techniques 

In this section we describe our approach for constructing a decoding algorithm. Recall that the
algorithm has the property that it decodes in $RM(m, m - 2r - 2)$ any error pattern $1_U$ which is
correctable from erasures in $RM(m, m - r - 1)$. Such patterns are characterized by the property
that the columns of $E(m, r)$ corresponding to the elements of $U$ are linearly independent vectors.
Thus, it suffices to give an algorithm that succeeds whenever the error pattern $1_U$ gives rise to
such linearly independent columns, which happens with probability $1 - o(1)$ for the regime of
parameters mentioned in Theorem 1 and Theorem 2.

So let us assume from now on that the error pattern $1_U$ corresponds to a set of linearly
independent columns in $E(m, r)$. Notice that by the choice of our parameters, our task is to recover $U$ from
the degree-$(2r + 1)$ syndrome of $U$. Furthermore, we want to do so efficiently. For convenience,
let $t = |U| = (1 - o(1))\binom{m}{\le r}$.

Recall that the degree-$(2r + 1)$ syndrome of $U$ is the $\binom{m}{\le 2r+1}$-dimensional vector $\alpha$ such that for every
monomial $M \in \mathcal{M}(m, 2r + 1)$, $\alpha_M = \sum_{i=1}^{t} M(u_i)$. Imagine now that we could somehow find
degree-$r$ polynomials $f_i(x_1, \ldots, x_m)$ satisfying $f_i(u_j) = \delta_{i,j}$. Then, from knowledge of $\alpha$ and, say,
$f_1$, we could compute the following sums:

$$\sigma_\ell = \sum_{i=1}^{t} (f_1 \cdot x_\ell)(u_i), \qquad \ell \in [m].$$

Indeed, if we know $\alpha$ and $f_1$ then we can compute each $\sigma_\ell$, as it just involves summing several
coordinates of $\alpha$ (since $\deg(f_1 \cdot x_\ell) \le r + 1$). We now observe that

$$\sigma_\ell = \sum_{i=1}^{t} (f_1 \cdot x_\ell)(u_i) = (f_1 \cdot x_\ell)(u_1) = (u_1)_\ell.$$

In other words, knowledge of such an $f_1$ would allow us to discover all coordinates of $u_1$ and in
particular, we will be able to deduce $u_1$, and similarly all other $u_i$ using $f_i$.
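The recovery step just described can be traced on a toy instance. In the sketch below a polynomial is a set of monomials (each a `frozenset` of variable indices), `times_var` multiplies by $x_\ell$ using $x_\ell^2 = x_\ell$, and `sum_over_errors` computes $\sum_i f(u_i)$ from the syndrome alone. These names, the representation, and the hand-picked $f_1$ are illustrative assumptions; in the actual algorithm $f_1$ is not known in advance.

```python
from itertools import combinations

def monomials(m, deg):
    """All multilinear monomials of degree <= deg, as frozensets of indices."""
    return [frozenset(S) for k in range(deg + 1) for S in combinations(range(m), k)]

def syndrome(U, m, deg):
    """alpha_M = sum over u in U of M(u), taken mod 2."""
    return {M: sum(all(u[i] for i in M) for u in U) % 2 for M in monomials(m, deg)}

def times_var(poly, l):
    """Multiply a multilinear polynomial by x_l, reducing with x_l^2 = x_l."""
    out = set()
    for M in poly:
        out ^= {M | {l}}    # symmetric difference: equal monomials cancel mod 2
    return out

def sum_over_errors(poly, alpha):
    """sum_i poly(u_i) mod 2, computed from the syndrome only."""
    return sum(alpha[M] for M in poly) % 2

m, r = 3, 1
U = [(1, 0, 1), (1, 1, 0)]
alpha = syndrome(U, m, 2 * r + 1)
f1 = {frozenset({2})}       # f_1 = x_3 (0-based index 2): 1 at U[0], 0 at U[1]
recovered = tuple(sum_over_errors(times_var(f1, l), alpha) for l in range(m))
```

With these choices `recovered` equals `U[0]`: each $\sigma_\ell$ reads off one coordinate of $u_1$.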

Our approach is thus to find such polynomials $f_i$. What we will do is set up a system of linear
equations in the coefficients of an unknown degree-$r$ polynomial $f$ and show that $f_1$ is the unique
solution to the system. Indeed, showing that $f_1$ is a solution is easy and the hard part is proving
that it is the unique solution.

To explain how we set the system of equations, let us assume for the time being that we actually
know $u_1$. Let $f = \sum_{M \in \mathcal{M}(m,r)} c_M \cdot M$, where we think of $\{c_M\}$ as unknowns. Consider the following
linear system:

1. $\sum_{i=1}^{t} f(u_i) = f(u_1) = 1$,

2. $\sum_{i=1}^{t} (f \cdot M)(u_i) = M(u_1)$ for all $M \in \mathcal{M}(m, r)$,

3. $\sum_{i=1}^{t} (f \cdot M \cdot (x_\ell + (u_1)_\ell + 1))(u_i) = M(u_1)$ for every $\ell \in [m]$ and for all $M \in \mathcal{M}(m, r)$.

In words, we have a system of $2 + \binom{m}{\le r} + m \cdot \binom{m}{\le r}$ equations in $\binom{m}{\le r}$ variables (the coefficients of
$f$). Observe that $f = f_1$ is indeed a solution to the system. To prove that it is the unique solution
we rely on the fact that the columns of $U^r$ are linearly independent and hence expressing $u_1^r$ as a
linear combination of those columns can be done in a unique way.

Now we explain what to do when we do not know $u_1$. Let $v = (v_1, \ldots, v_m) \in \mathbb{F}_2^m$. We modify
the linear system above to:

1. $\sum_{i=1}^{t} f(u_i) = f(v) = 1$,

2. $\sum_{i=1}^{t} (f \cdot M)(u_i) = M(v)$ for all $M \in \mathcal{M}(m, r)$,

3. $\sum_{i=1}^{t} (f \cdot M \cdot (x_\ell + v_\ell + 1))(u_i) = M(v)$ for all $\ell \in [m]$ and $M \in \mathcal{M}(m, r)$.

Now the point is that one can prove that if a solution exists then it must be the case that $v$ is an
element of $U$. Indeed, the set of equations in item 2 implies that $v^r$ is in the linear span of the
columns of $U^r$. The linear equations in item 3 then imply that $v$ must actually be in the set $U$.
Notice that what we actually do amounts to setting, for every $v \in \mathbb{F}_2^m$, a system of roughly
$m \cdot \binom{m}{\le r}$ linear equations in $\binom{m}{\le r}$ unknowns. Such a system can be solved in time $\mathrm{poly}(\binom{m}{\le r})$. Thus, when we
go over all $v \in \mathbb{F}_2^m$ we get a running time of $2^m \cdot \mathrm{poly}(\binom{m}{\le r})$, as claimed.

Our proof can be viewed as an algorithmic version of the proof of Theorem 1.8 of Abbe et al.
[ASW15]. That theorem asserts that when the columns of $U^r$ are linearly independent, the
$(2r + 1)$-syndrome of $U$ is unique. In their proof of the theorem they first use the $(2r)$-syndrome to claim
that if $V$ is another set with the same $(2r)$-syndrome then the column span of $U^r$ is the same as that
of $V^r$. Then, using the degree-$(2r + 1)$ monomials they deduce that $U = V$. This is similar to what
our linear system does, but, in contrast, [ASW15] did not have an efficient algorithmic version of
this statement.

2 Decoding Algorithm For Reed-Muller Codes 

We begin with the following basic linear algebraic fact.

Lemma 6. Let $u_1, \ldots, u_t \in \mathbb{F}_2^m$ such that $\{u_1^r, \ldots, u_t^r\}$ are linearly independent. Then, for every $i \in [t]$,
there exists a polynomial $f_i$ of degree at most $r$ so that for every $j \in [t]$,

$$f_i(u_j) = \delta_{i,j} = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{otherwise.} \end{cases}$$

For completeness, we give the short proof.

Proof. Consider the matrix $U^r \in \mathbb{F}_2^{t \times \binom{m}{\le r}}$ whose $i$-th row is $u_i^r$. A polynomial $f_i$ which satisfies
the properties of the lemma is a solution to the linear system $U^r x = e_i$, where $e_i \in \mathbb{F}_2^t$ is the $i$-th
elementary basis vector (that is, $(e_i)_j = \delta_{i,j}$) and the unknowns are the coefficients of $f_i$. By
the assumption that $U^r$ is of full rank, indeed there exists a solution. □
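The proof of Lemma 6 is constructive, and can be sketched in a few lines of Python: build the matrix whose $i$-th row is $u_i^r$ and solve $U^r x = e_i$ by Gaussian elimination over $\mathbb{F}_2$. The solver `solve_gf2` and the helper names below are our own minimal illustration (free variables are simply set to 0), not code from the paper.

```python
from itertools import combinations

def monomials(m, r):
    """All multilinear monomials of degree <= r, as tuples of variable indices."""
    return [S for k in range(r + 1) for S in combinations(range(m), k)]

def solve_gf2(A, b):
    """Return one solution x of A x = b over F_2, or None if none exists."""
    rows = [row[:] + [rhs] for row, rhs in zip(A, b)]
    ncols, rank, pivots = len(A[0]), 0, []
    for c in range(ncols):
        p = next((i for i in range(rank, len(rows)) if rows[i][c]), None)
        if p is None:
            continue
        rows[rank], rows[p] = rows[p], rows[rank]
        for i in range(len(rows)):
            if i != rank and rows[i][c]:
                rows[i] = [x ^ y for x, y in zip(rows[i], rows[rank])]
        pivots.append(c)
        rank += 1
    if any(row[-1] for row in rows[rank:]):
        return None                      # a row reads 0 = 1: inconsistent
    x = [0] * ncols
    for i, c in enumerate(pivots):
        x[c] = rows[i][-1]
    return x

def dual_polynomials(U, m, r):
    """Coefficient vectors of f_1..f_t with f_i(u_j) = delta_{i,j}."""
    mons = monomials(m, r)
    A = [[int(all(u[i] for i in M)) for M in mons] for u in U]  # i-th row is u_i^r
    t = len(U)
    return [solve_gf2(A, [int(i == j) for j in range(t)]) for i in range(t)]

def evaluate(coeffs, mons, u):
    """Evaluate the polynomial with the given coefficient vector at the point u."""
    return sum(c & all(u[i] for i in M) for c, M in zip(coeffs, mons)) % 2

U = [(1, 0, 1), (1, 1, 0), (0, 1, 1)]
duals = dual_polynomials(U, 3, 1)
```

For this choice of $U$ the vectors $u_i^1$ are linearly independent, so each system is solvable and `evaluate(duals[i], monomials(3, 1), U[j])` is 1 exactly when $i = j$.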

The algorithm would proceed by making a guess $v = (v_1, \ldots, v_m) \in \mathbb{F}_2^m$ for one of the error
locations. If we could come up with an efficient way to verify that the guess is correct, this would
immediately yield a decoding algorithm. We shall verify our guess by using the dual polynomials
$f_1, \ldots, f_t$ described above. We shall find them by solving a system of linear equations that can be
constructed from the $(2r + 1)$-syndrome of $\{u_1, \ldots, u_t\}$. We will need the following crucial, yet
simple, observation.

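Observation 7. Let $f$ be any $m$-variate polynomial of degree at most $2r + 1$, and $u_1, \ldots, u_t \in \mathbb{F}_2^m$. Then,
the sum $\sum_{i=1}^{t} f(u_i)$ can be computed given the $(2r + 1)$-syndrome of $\{u_1, \ldots, u_t\}$, in time $O(\binom{m}{\le 2r+1})$.

Proof. For any $M \in \mathcal{M}(m, 2r + 1)$, denote $\alpha_M = \sum_{i=1}^{t} M(u_i)$ (so that $\alpha = (\alpha_M)_{M \in \mathcal{M}(m, 2r+1)}$ is
precisely the syndrome of $\{u_1, \ldots, u_t\}$). Write $f = \sum_{M \in \mathcal{M}(m, 2r+1)} c_M \cdot M$ where $c_M \in \mathbb{F}_2$, then

$$\sum_{i=1}^{t} f(u_i) = \sum_{i=1}^{t} \sum_{M \in \mathcal{M}(m, 2r+1)} c_M \cdot M(u_i) = \sum_{M \in \mathcal{M}(m, 2r+1)} c_M \left( \sum_{i=1}^{t} M(u_i) \right) = \sum_{M \in \mathcal{M}(m, 2r+1)} c_M \alpha_M. \qquad □$$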

The following lemma shows how to verify a guess for an error location. It is the main
ingredient in the analysis of our algorithm and the reason why it works. Basically, the lemma gives
a system of linear equations whose solution enables us to decide whether a given $v \in \mathbb{F}_2^m$ is a
corrupted coordinate or not, without knowledge of the set of errors $U$ but only of its syndrome.
In a sense, this lemma is analogous to the Berlekamp-Welch algorithm, which also gives a system
of linear equations whose solution reveals the set of erroneous locations ([WB86], and see also the
exposition in Chapter 13 of [GRS14]).

Lemma 8 (Main Lemma). Let $u_1, \ldots, u_t \in \mathbb{F}_2^m$ such that $\{u_1^r, \ldots, u_t^r\}$ are linearly independent, and
$v = (v_1, \ldots, v_m) \in \mathbb{F}_2^m$. Suppose there exists a multilinear polynomial $f \in \mathbb{F}_2[x_1, \ldots, x_m]$ with $\deg(f) \le r$
such that for every monomial $M \in \mathcal{M}(m, r)$,

1. $\sum_{i=1}^{t} f(u_i) = f(v) = 1$,

2. $\sum_{i=1}^{t} (f \cdot M)(u_i) = M(v)$, and

3. $\sum_{i=1}^{t} (f \cdot M \cdot (x_\ell + v_\ell + 1))(u_i) = M(v)$ for every $\ell \in [m]$.

Then there exists $i \in [t]$ such that $v = u_i$.

Observe that if indeed $v = u_i$ for some $i \in [t]$, then the polynomial $f_i$ guaranteed by Lemma 6
satisfies those equations. Hence, the lemma should be interpreted as saying the converse: that
if there exists such a solution, then $v = u_i$ for some $i$. Further, given the $(2r + 1)$-syndrome of
$\{u_1, \ldots, u_t\}$ as input, Observation 7 shows that each of the above constraints is a linear constraint
in the coefficients of $f$. Thus, finding such an $f$ is merely solving a system of $O(m \cdot \binom{m}{\le r})$ linear
equations in $\binom{m}{\le r}$ unknowns and can be done in $\mathrm{poly}(\binom{m}{\le r})$ time.

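Proof of Lemma 8. Let $J = \{j \mid f(u_j) = 1\}$. Note that by item 1 it holds that $J \neq \emptyset$.

Subclaim 9. $\sum_{i \in J} u_i^r = v^r$.

Proof. Let $M \in \mathcal{M}(m, r)$. We show that $\sum_{i \in J} M(u_i) = M(v)$, i.e., that the $M$'th
coordinate of $\sum_{i \in J} u_i^r$ is equal to that of $v^r$. Indeed, as $f$ satisfies the constraints in
item 2,

$$M(v) = \sum_{i=1}^{t} (f \cdot M)(u_i) = \sum_{i \in J} (f \cdot M)(u_i) + \sum_{i \notin J} (f \cdot M)(u_i) = \sum_{i \in J} M(u_i). \qquad □ \text{(Subclaim)}$$

For any $\ell \in [m]$, let $J_\ell = \{j \mid f(u_j) = 1 \text{ and } (u_j)_\ell = v_\ell\} \subseteq J$. Observe that this definition implies
that for every $j \in [t]$, the index $j$ is in $J_\ell$ if and only if $(f \cdot (x_\ell + v_\ell + 1))(u_j) = 1$. Using a similar
argument, we can show the following.

Subclaim 10. For every $\ell \in [m]$,

$$\sum_{i \in J_\ell} u_i^r = v^r. \qquad (11)$$

Proof. Again, for any $M \in \mathcal{M}(m, r)$ the constraints in item 3 imply that

$$M(v) = \sum_{i=1}^{t} (f \cdot M \cdot (x_\ell + v_\ell + 1))(u_i) = \sum_{i \in J_\ell} M(u_i). \qquad □ \text{(Subclaim)}$$

From the above claims,

$$v^r = \sum_{i \in J} u_i^r = \sum_{i \in J_1} u_i^r = \cdots = \sum_{i \in J_m} u_i^r.$$

By the linear independence of $\{u_1^r, \ldots, u_t^r\}$, it follows that $J = J_1 = J_2 = \cdots = J_m$. Indeed, there
is a unique linear combination of $\{u_1^r, \ldots, u_t^r\}$ that gives $v^r$. Every index $j$ in the
(non-empty) intersection $\bigcap_{\ell=1}^{m} J_\ell$ satisfies $(u_j)_\ell = v_\ell$ for all $\ell \in [m]$, that is, $u_j = v$, and so there exists $i \in [t]$ so that $u_i = v$. □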

Lemma 8 implies a natural algorithm for decoding from $t$ errors indexed by vectors $\{u_1, \ldots, u_t\}$,
assuming $\{u_1^r, \ldots, u_t^r\}$ are linearly independent, that we write down explicitly in Algorithm 1.

Algorithm 1: Reed-Muller Decoding

Input: A $(2r + 1)$-syndrome of $\{u_1, \ldots, u_t\}$

$\mathcal{E} = \emptyset$
for all $v = (v_1, \ldots, v_m) \in \mathbb{F}_2^m$ do
    Solve for a polynomial $f \in \mathbb{F}_2[x_1, \ldots, x_m]$ of degree at most $r$:
        • $\sum_{i=1}^{t} f(u_i) = f(v) = 1$,
        • $\sum_{i=1}^{t} (f \cdot M)(u_i) = M(v)$ for all $M \in \mathcal{M}(m, r)$,
        • $\sum_{i=1}^{t} (f \cdot M \cdot (x_\ell + v_\ell + 1))(u_i) = M(v)$ for all $\ell \in [m]$ and $M \in \mathcal{M}(m, r)$.
    if there is a polynomial $f$ that satisfies the above system of equations then
        Add $v$ to the set $\mathcal{E}$.
return the set $\mathcal{E}$ as the error locations.

Theorem 12. Given the $(2r + 1)$-syndrome of $t$ unknown vectors $\{u_1, \ldots, u_t\} \subseteq \mathbb{F}_2^m$ such that
$\{u_1^r, \ldots, u_t^r\}$ are linearly independent, Algorithm 1 outputs $\{u_1, \ldots, u_t\}$, runs in time $2^m \cdot \mathrm{poly}(\binom{m}{\le r})$
and can be realized using a circuit of depth $\mathrm{poly}(m) = \mathrm{poly}(\log n)$.

Proof. The algorithm enumerates all vectors in $\mathbb{F}_2^m$, and for each candidate $v$ checks whether there exists a solution to the linear system of $\mathrm{poly}\left(\binom{m}{\le r}\right)$ equations in $\mathrm{poly}\left(\binom{m}{\le r}\right)$ unknowns given in Lemma 8. Observation 7 shows that this system of linear equations can be constructed from the $(2r + 1)$-syndrome in $\mathrm{poly}\left(\binom{m}{\le r}\right)$ time.

By Lemma 6 and Lemma 8, a solution to this system exists if and only if there is $i \in [t]$ so that $v = u_i$. The bound on the running time follows from the description of the algorithm. Furthermore, all $2^m = n$ linear systems can be solved in parallel, and each linear system can be solved with an $\mathrm{NC}^2$ circuit (see, e.g., [MV97]). □
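As an illustration, the steps of Algorithm 1 can be sketched in Python for small parameters. This is only a sketch under assumptions that are not part of the text: the syndrome is represented as a dictionary mapping each multilinear monomial $M$ of degree at most $2r+1$ (a frozenset of variable indices) to $\sum_i M(u_i) \bmod 2$, the unknowns are the coefficients of $f$ in the monomial basis, and the helper names (`monomials`, `syndrome`, `pack`, `is_consistent`, `locate_errors`) are hypothetical. Products of monomials are taken as unions of index sets, which is valid on 0/1 points.

```python
from itertools import combinations, product

def monomials(m, d):
    """All multilinear monomials of degree <= d in m variables, as frozensets of indices."""
    return [frozenset(c) for k in range(d + 1) for c in combinations(range(m), k)]

def evaluate(mono, point):
    """Value of a monomial at a 0/1 point."""
    return int(all(point[i] for i in mono))

def syndrome(errors, m, r):
    """Assumed format: each monomial M of degree <= 2r+1 maps to sum_i M(u_i) mod 2."""
    return {M: sum(evaluate(M, u) for u in errors) % 2 for M in monomials(m, 2 * r + 1)}

def pack(coeffs, rhs):
    """Pack one equation into an int: bit 0 is the right-hand side, bit j+1 the j-th coefficient."""
    row = rhs
    for j, c in enumerate(coeffs):
        row |= c << (j + 1)
    return row

def is_consistent(rows):
    """Gaussian elimination over GF(2); returns False iff some combination of rows reads 0 = 1."""
    basis = {}
    for row in rows:
        while row > 1:
            lead = row.bit_length() - 1
            if lead not in basis:
                basis[lead] = row
                break
            row ^= basis[lead]
        if row == 1:
            return False
    return True

def locate_errors(synd, m, r):
    """Return every v for which the linear system of Algorithm 1 has a solution f."""
    mons = monomials(m, r)  # the unknowns are the coefficients of f on these monomials
    located = []
    for v in product((0, 1), repeat=m):
        rows = [pack([synd[P] for P in mons], 1),         # sum_i f(u_i) = 1
                pack([evaluate(P, v) for P in mons], 1)]  # f(v) = 1
        for M in mons:
            rhs = evaluate(M, v)
            rows.append(pack([synd[P | M] for P in mons], rhs))  # sum_i (f*M)(u_i) = M(v)
            for l in range(m):
                # sum_i (f * M * (x_l + v_l + 1))(u_i) = M(v)
                rows.append(pack(
                    [synd[P | M | {l}] ^ ((v[l] ^ 1) & synd[P | M]) for P in mons], rhs))
        if is_consistent(rows):
            located.append(v)
    return located
```

For example, with $m = 4$, $r = 1$ and the error locations `[(0,0,0,0), (1,0,0,0), (0,1,0,0)]`, whose vectors $u_i^1$ are linearly independent, `locate_errors(syndrome(errors, 4, 1), 4, 1)` recovers exactly these three points. The sketch solves the $2^m$ systems sequentially rather than in parallel.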

Observe that the proof of correctness for Algorithm 1 is valid, for any value of $r$, whenever the set of error locations $\{u_1, \ldots, u_t\}$ satisfies the property that $\{u_1^r, \ldots, u_t^r\}$ are linearly independent. Therefore, we would like to apply Theorem 12 in settings where $\{u_1^r, \ldots, u_t^r\}$ are linearly independent with high probability.
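The independence condition is easy to test directly for a given set of points. The following Python sketch assumes that $u^r$ denotes the vector of evaluations at $u$ of all multilinear monomials of degree at most $r$; the function names are hypothetical and the bit-packed representation is only a convenience.

```python
from itertools import combinations

def eval_vector(u, r):
    """u^r: values at u of all monomials of degree <= r, packed into the bits of an int."""
    bits, pos = 0, 0
    for d in range(r + 1):
        for mono in combinations(range(len(u)), d):
            if all(u[i] for i in mono):
                bits |= 1 << pos
            pos += 1
    return bits

def rank_gf2(vectors):
    """Rank over GF(2) of bit-packed vectors, via elimination on the leading bit."""
    basis = {}
    for vec in vectors:
        while vec:
            lead = vec.bit_length() - 1
            if lead not in basis:
                basis[lead] = vec
                break
            vec ^= basis[lead]
    return len(basis)

def independent(points, r):
    """True iff the vectors u^r, for u in points, are linearly independent over GF(2)."""
    return rank_gf2([eval_vector(u, r) for u in points]) == len(points)
```

For instance, the four points of the plane $\{x_3 = 0\}$ in $\mathbb{F}_2^3$ are dependent for $r = 1$ (their vectors $u^1$ sum to zero) but independent for $r = 2$, since the monomial $x_1 x_2$ separates $(1,1,0)$ from the other three.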

For the constant rate regime, Kumar and Pfister [KP15] and Kudekar, Mondelli, Şaşoğlu and Urbanke [KMŞU15] proved that $RM(m, m - r - 1)$ achieves capacity for $r = m/2 \pm O(\sqrt{m})$.

Theorem 13 ([KP15], Theorem 23). Let $r < m$ be integers such that $r = m/2 \pm O(\sqrt{m})$. Then, for $t = (1 - o(1))\binom{m}{\le r}$, with probability $1 - o(1)$, for a set of vectors $\{u_1, \ldots, u_t\} \subseteq \mathbb{F}_2^m$ chosen uniformly at random, it holds that $\{u_1^r, \ldots, u_t^r\}$ are linearly independent over $\mathbb{F}_2$.

Letting $r = m/2 - o(\sqrt{m})$ and looking at the code $RM(m, m - 2r - 2) = RM(m, o(\sqrt{m}))$ so that $\binom{m}{\le r} = (1/2 - o(1))2^m$, we get the following statement, stated earlier as Theorem 1.

Corollary 14. There exists a (deterministic) algorithm that is able to correct $t = (1/2 - o(1))2^m$ random errors in $RM(m, o(\sqrt{m}))$ with probability $1 - o(1)$. The algorithm runs in time $2^m \cdot \mathrm{poly}\left(\binom{m}{\le m/2 - o(\sqrt{m})}\right)$.

Alternatively, we can pick $r = m/2 - O(\sqrt{m})$ and correct $c \cdot 2^m$ random errors in the code $RM(m, O(\sqrt{m}))$, where $c$ is some positive constant that goes to zero as the constant hidden under the big $O$ increases.
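To get a feel for the quantity $\binom{m}{\le r} = \sum_{i \le r} \binom{m}{i}$ that governs the number of correctable errors, it can be computed exactly for concrete parameters. The short Python sketch below uses hypothetical helper names and only illustrates the arithmetic; it says nothing about the asymptotic $o(\cdot)$ terms.

```python
from math import comb

def binom_le(m, r):
    """Number of multilinear monomials of degree <= r in m variables."""
    return sum(comb(m, i) for i in range(r + 1))

def error_fraction(m, r):
    """binom(m, <= r) as a fraction of the block length n = 2^m."""
    return binom_le(m, r) / 2 ** m

# As r approaches m/2 from below, the fraction approaches 1/2.
for r in (170, 180, 190, 199):
    print(r, error_fraction(400, r))
```

Calling `error_fraction(m, r)` for $r$ a few multiples of $\sqrt{m}$ below $m/2$ shows how the constant $c$ shrinks as $r$ moves away from $m/2$.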

For the high-rate regime, recall the following capacity achieving result proved in [ASW15]: 


Theorem 15 ([ASW15], Theorem 4.5). Let $\epsilon > 0$, $r < m$ be two positive integers and $t < \binom{m - \log\binom{m}{\le r} - \log(1/\epsilon)}{\le r}$. Then, with probability at least $1 - \epsilon$, for a set of vectors $\{u_1, \ldots, u_t\} \subseteq \mathbb{F}_2^m$ chosen uniformly at random, it holds that $\{u_1^r, \ldots, u_t^r\}$ are linearly independent over $\mathbb{F}_2$.

Using Theorem 15, we apply Theorem 12 to obtain the following corollary, which was stated 
informally as Theorem 2. 


Corollary 16. Let $\epsilon > 0$, and $r < m$ be two positive integers. Then there exists a (deterministic) algorithm that is able to correct $t = \binom{m - \log\binom{m}{\le r} - \log(1/\epsilon)}{\le r} - 1$ random errors in $RM(m, m - (2r + 2))$ with probability at least $1 - \epsilon$. The algorithm runs in time $2^m \cdot \mathrm{poly}\left(\binom{m}{\le r}\right)$.


If $r = o(\sqrt{m / \log m})$, the bound on $t$ is $(1 - o(1))\binom{m}{\le r}$, as promised.

More generally, a positive answer to Question 3 is equivalent to $\{u_1^r, \ldots, u_t^r\}$ for $t = (1 - o(1))\binom{m}{\le r}$ being linearly independent with probability $1 - o(1)$ (see Corollary 2.9 in [ASW15]), and thus we also obtain the following corollary, which was stated informally as Theorem 4.


Corollary 17. Let $r < m$ be two positive integers. Suppose that $RM(m, m - r - 1)$ achieves capacity for the BEC. Then there exists a (deterministic) algorithm that is able to correct $(1 - o(1))\binom{m}{\le r}$ random errors in $RM(m, m - (2r + 2))$ with probability $1 - o(1)$. The algorithm runs in time $2^m \cdot \mathrm{poly}\left(\binom{m}{\le r}\right)$.

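We note that for all values of $r$, $2^m \cdot \mathrm{poly}\left(\binom{m}{\le r}\right)$ is polynomial in the block length $n = 2^m$, and when $r = o(m)$ this is equal to $n^{1 + o(1)}$, since in that case $\binom{m}{\le r} = 2^{o(m)}$.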


3 Abstractions and Generalizations 

3.1 An abstract view of the decoding algorithm 

In this section we present a more abstract view of Algorithm 1, in the spirit of the works by Pellikaan, Duursma and Kötter ([Pel92, DK94]) which abstract the Berlekamp-Welch algorithm (see also the exposition in [Sud01]). Stated in this way, it is also clear that the algorithm works over larger alphabets, so we no longer limit ourselves to dealing with binary alphabets. As shown in [KP15], Reed-Muller codes over $\mathbb{F}_q$ (sometimes referred to as Generalized Reed-Muller codes) also achieve capacity in the constant rate regime.

We begin by giving the definition of a (pointwise) product of two vectors, and of two codes. 

Definition 18. Let $u, v \in \mathbb{F}_q^n$. Denote by $u * v \in \mathbb{F}_q^n$ the vector $(u_1 v_1, \ldots, u_n v_n)$. For $A, B \subseteq \mathbb{F}_q^n$ we similarly define $A * B = \{u * v \mid u \in A, v \in B\}$.
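Definition 18 translates directly into code. The sketch below assumes, for simplicity, that $q$ is prime so that arithmetic in $\mathbb{F}_q$ is arithmetic modulo $q$, and that codes are given as explicit sets of tuples; the names `star` and `star_codes` are hypothetical.

```python
def star(u, v, q=2):
    """Pointwise product u * v of two vectors over F_q (q prime assumed)."""
    return tuple((a * b) % q for a, b in zip(u, v))

def star_codes(A, B, q=2):
    """A * B = {u * v : u in A, v in B} for two sets of vectors."""
    return {star(u, v, q) for u in A for v in B}
```

Note that $A * B$ as defined is a set of products and need not be closed under addition; the condition used below only asks that it be contained in a code $N$.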

Following the footsteps of Algorithm 1, we wish to decode, in a code C, error patterns which 
are correctable from erasures in a related code N, through the use of an error-locating code E. Under 
some assumptions on C, N and E, we can use a similar proof in order to do this. 
Theorem 19. Let $E, C, N \subseteq \mathbb{F}_q^n$ be codes with the following properties.

1. $E * C \subseteq N$.

2. For any pattern $\mathbf{1}_U$ that is correctable from erasures in $N$, and for any coordinate $i \notin U$, there exists a codeword $e \in E$ such that $e_j = 0$ for all $j \in U$ and $e_i = 1$.

Then there exists an efficient algorithm that corrects in $C$ any pattern $\mathbf{1}_U$ which is correctable from erasures in $N$.

To put things in perspective, earlier we set $C = RM(m, m - 2r - 2)$, $N = RM(m, m - r - 1)$ and $E = RM(m, r + 1)$. It is immediate to observe that item 1 holds in this case, and item 2 is guaranteed by Lemma 6: Indeed, consider the error pattern $U = \{u_1, \ldots, u_t\}$ and the dual polynomials $\{f_i\}_{i=1}^{t}$, and let $v \notin U$ be any other coordinate of the code. If there exists $j \in [t]$ such that $f_j(v) = 1$, we can pick the codeword $g = f_j \cdot (1 + x_\ell + v_\ell)$, where $\ell$ is some coordinate such that $v_\ell \ne (u_j)_\ell$. The polynomial $g$ has degree at most $r + 1$ and so it is a codeword in $E$, and it can be directly verified that it satisfies the conditions of item 2. If $f_j(v) = 0$ for all $j$, we can pick $g = 1 - \sum_{i=1}^{t} f_i$.

It is also worth pointing out the differences between our approach and the abstract Berlekamp-Welch decoder of Duursma and Kötter: They similarly set up codes $E$, $C$ and $N$ such that $E * C \subseteq N$. However, instead of item 2, they require that for any $e \in E$ and $c \in C$, if $e * c = 0$ then $e = 0$ or $c = 0$ (or similar requirements regarding the distances of $E$ and $C$ that guarantee this property). This property, as well as the distance properties, do not hold in the case of Reed-Muller codes.

Turning back to the proof of Theorem 19, the algorithm and the proof of correctness turn out to be very short to describe in this level of generality. Given a word $y \in \mathbb{F}_q^n$, the algorithm would solve the linear system $a * y = b$, in unknowns $a \in E$ and $b \in N$. Under the hypothesis of the theorem, we show that the common zeros of the possible solutions for $a$ determine exactly the error locations. Once the locations of the errors are identified, correcting them is easy: we can replace the error locations by the symbol '?' and use an algorithm which corrects erasures (this can always be done efficiently, when unique decoding is possible, as this merely amounts to solving a system of linear equations). The algorithm is given in Algorithm 2.


Algorithm 2: Abstract Decoding Algorithm

Input: received word $y \in \mathbb{F}_q^n$ such that $y = c + e$, with $c \in C$ and $e$ is supported on a set $U$

1: Solve for $a \in E$, $b \in N$, the linear system $a * y = b$.

2: Let $\{a_1, \ldots, a_k\}$ be a basis for the solution space of $a$, and let $\mathcal{E}$ denote the common zeros of $\{a_i \mid i \in [k]\}$.

3: For every $j \in \mathcal{E}$, replace $y_j$ with '?', to get a new word $y'$.

4: Correct $y'$ from erasures in $C$.
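The four steps of Algorithm 2 can be sketched in Python for binary codes. Everything about the representation here is an assumption made for illustration: vectors are packed into integers (bit $p$ is coordinate $p$), $E$ and $C$ are given by generator rows, $N$ by parity-check rows, and all function names are hypothetical. The condition $a * y \in N$ is tested through the parity checks of $N$, so the unknown $b$ never has to be represented explicitly.

```python
from itertools import combinations

def rm_generators(m, d):
    """Generator rows of RM(m, d) as ints; bit p holds the value of a monomial at point p."""
    gens = []
    for k in range(d + 1):
        for mono in combinations(range(m), k):
            mask = sum(1 << i for i in mono)
            gens.append(sum(1 << p for p in range(1 << m) if p & mask == mask))
    return gens

def parity(x):
    return bin(x).count("1") & 1

def solution_basis(gens_E, checks_N, y):
    """Basis of {a in E : a * y in N}, found by eliminating on the N-syndromes of g * y."""
    pivots, kernel = {}, []
    for g in gens_E:
        s = sum(parity(h & g & y) << i for i, h in enumerate(checks_N))
        a = g
        while s:
            lead = s.bit_length() - 1
            if lead not in pivots:
                pivots[lead] = (s, a)
                break
            s, a = s ^ pivots[lead][0], a ^ pivots[lead][1]
        else:
            kernel.append(a)  # syndrome reduced to zero, so a * y is a codeword of N
    return kernel

def correct_erasures(gens_C, y, erased, n):
    """Codeword of C agreeing with y outside the erased positions, or None if there is none."""
    keep = ((1 << n) - 1) & ~erased
    pivots = {}
    for g in gens_C:
        s, c = g & keep, g
        while s:
            lead = s.bit_length() - 1
            if lead not in pivots:
                pivots[lead] = (s, c)
                break
            s, c = s ^ pivots[lead][0], c ^ pivots[lead][1]
    s, c = y & keep, 0
    while s:
        lead = s.bit_length() - 1
        if lead not in pivots:
            return None
        s, c = s ^ pivots[lead][0], c ^ pivots[lead][1]
    return c

def abstract_decode(gens_E, checks_N, gens_C, y, n):
    """Steps 1-4: solve a * y in N, take common zeros, then erasure-decode in C."""
    nonzero = 0
    for a in solution_basis(gens_E, checks_N, y):
        nonzero |= a
    located = ((1 << n) - 1) & ~nonzero
    return located, correct_erasures(gens_C, y, located, n)
```

As a small instance of the Reed-Muller setting with $m = 4$ and $r = 1$, one can take `rm_generators(4, 2)` for $E = RM(4, 2)$, `rm_generators(4, 1)` as parity checks of $N = RM(4, 2)$ (its dual is $RM(4, 1)$), and `rm_generators(4, 0)` for $C = RM(4, 0)$. Flipping coordinates 0, 1 and 2 of the all-ones codeword and calling `abstract_decode` returns the mask `0b111` together with the all-ones codeword.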


Note that in Theorem 19 we assume that the error pattern $U$ is correctable from erasures in $N$, whereas Algorithm 2 first computes a set of error locations $\mathcal{E}$ and then corrects $y'$ from erasures in $C$. Thus, the proof of Theorem 19 can be divided into two steps. The first, and the main one, will be to show that $\mathcal{E} = U$. The second, which is merely an immediate observation, will be to show that $U$ is also correctable from erasures in $C$. We begin with the second part:

Lemma 20. Assume the setup of Theorem 19, and let U be any pattern which is correctable from erasures 
in N. Then U is also correctable from erasures in C. 
Proof. We may assume that $U \ne \emptyset$, as otherwise the statement is trivial. Suppose on the contrary that $U$ is not correctable from erasures in $C$, that is, there exists a non-zero codeword $c \in C$ supported on $U$. For any $a \in E$, we have that $a * c$ is a codeword of $N$ which is supported on a subset of $U$. In order to reach a contradiction, we want to pick $a \in E$ so that $a * c$ is a non-zero codeword of $N$, which contradicts the assumption that $U$ is correctable from erasures in $N$.

Pick $i \in U$ so that $c_i \ne 0$. Observe that if $U$ is correctable from erasures in $N$ then so is $U \setminus \{i\}$. By item 2 in Theorem 19 with respect to the set $U \setminus \{i\}$ there exists $a \in E$ with $a_i = 1$. Thus $(a * c)_i = c_i \ne 0$, and in particular $a * c$ is non-zero. □

We now prove the main part of Theorem 19, that is, that under the assumptions stated in the theorem, Algorithm 2 correctly decodes (in $C$) any error pattern that is correctable from erasures in $N$.

Proof of Theorem 19. Write $y = c + e$, so that $c \in C$ is the transmitted codeword and $e$ is supported on the set of error locations $U$. As noted above, by Lemma 20 it is enough to show that under the assumptions of the theorem (in particular, that $U$ is correctable from erasures in $N$), the set of error locations $\mathcal{E}$ computed by Algorithm 2 equals $U$.

In the following two subclaims, we argue that any solution $a$ for the system vanishes on the error points, and then that for every other index $i$, there exists a solution whose $i$-th entry is non-zero (and so there must be a basis element for the solution space whose $i$-th entry is non-zero).

The following subclaim states that every solution $a \in E$ to the equation $a * y = b$ vanishes on $U$, the support of $e$. In the pointwise product notation, this is equivalent to showing that $a * e = 0$.

Subclaim 21. For every $a \in E$, $b \in N$ such that $a * y = b$, it holds that $a * e = 0$.

Proof. Since $a * y = b \in N$ (by the assumption) and $a * c \in N$ (by item 1), we get that $a * e = a * y - a * c$ is also a codeword in $N$. Furthermore, $a * e$ is also supported on $U$, and since $U$ is an erasure-correctable pattern in $N$, the only codeword that is supported on $U$ is the zero codeword. □ (Subclaim)

To finish the proof, we show that for any $i \notin U$, there is a solution $a$ to the system of linear equations with $a_i = 1$.

Subclaim 22. For every $i \notin U$ there exists $a \in E$, $b \in N$ such that $a$ is 0 on $U$, $a_i = 1$ and $a * y = b$.

Proof. By item 2, since $U$ is correctable from erasures in $N$, for every $i \notin U$ we can pick $a \in E$ such that $a$ is 0 on $U$ and $a_i = 1$. Set $b = a * y$. It remains to be shown that $b$ is a codeword of $N$. This follows from the fact that

$$b = a * c + a * e = a * c,$$

where the second equality follows from the fact that $a$ is zero on $U$ (the support of $e$). Finally, $a * c$ is a codeword of $N$ by item 1. □ (Subclaim)

These two claims complete the proof of the theorem. □ 
3.2 Decoding of Linear Codes over $\mathbb{F}_2$

In [ASW15], it is observed that their results for Reed-Muller codes imply that for every linear code 
N, every pattern which is correctable from erasures in N is correctable from errors in what they 
call the "degree-three tensoring" of N. One can in fact use our Algorithm 1 almost verbatim to 
obtain an efficient version of this statement. However, here we remark that this is nothing but a 
special case of Theorem 19 with an appropriate setting of the codes E, C, N. We begin by briefly 
describing their definitions and their argument. 

The basic tool used by [ASW15] is embedding any parity check matrix in the matrix $E(m, 1)$ for an appropriate choice of $m$. Let $N$ be any linear code of dimension $k$ over $\mathbb{F}_2$ and $H$ be its parity check matrix. For convenience, we first extend $N$ by adding a parity bit. This increases the block length by 1, does not decrease the distance and preserves the dimension. A parity check matrix for the extended code can be obtained from $H$ by constructing the matrix

$$H_0 = \begin{pmatrix} 1 & 1 \cdots 1 \\ 0 & \\ \vdots & H \\ 0 & \end{pmatrix}.$$

The main observation now is that $E(m, 1)$ is an $(m + 1) \times 2^m$ matrix that contains all vectors of the form $(1, v)$ for $v \in \mathbb{F}_2^m$, so if we set $m = n - k$ to be the number of rows of $H$, we can pick a subset $S$ of the columns of $E(m, 1)$ that correspond to the columns that appear in $H_0$.

[ASW15] then define the degree-three tensoring of $N$, which is a code $C$ whose parity check matrix is $H_0^{\otimes 3}$: this is a matrix with $n$ columns and with rows indexed by tuples $i_1 < i_2 < i_3$, with the corresponding row being the pointwise product (as in Definition 18) of rows $i_1, i_2, i_3$ of $H_0$. One can then verify that Algorithm 1 can be used in order to correct (in $C$) any error pattern which is correctable from erasures in $N$, by using the algorithm with $r = 1$ and having the error location guesses run only over the columns in $S$.

A closer look reveals that this construction is in fact a special case of Theorem 19. Given any linear binary code $N$ with parity check matrix $H$, the main observation of [ASW15] can be interpreted as saying that when we add a parity bit to $N$, we can embed $N$ in a puncturing of $RM(m, m - 2)$ (whose parity check matrix is $E(m, 1)$). We state it in the following claim:

Claim 23. Let $N'$ denote the subcode of $RM(m, m - 2)$ of all words that are 0 outside $S$. Then $N$ is precisely the restriction of $N'$ to the $S$ coordinates.

Proof. Let $b \in N$. Then $H_0 b = 0$, i.e. the columns of $H_0$ indexed by the non-zero elements in $b$ add up to 0. Let $b' \in \mathbb{F}_2^{2^m}$ denote the extension of $b$ into a vector of length $2^m$ obtained by filling 0's in every coordinate not in $S$. Then $E(m, 1) b' = 0$, since the same columns that appeared in $H_0$ appear in $E(m, 1)$. This implies that $b' \in N'$.

Similarly, for every $b' \in N'$, we can define $b$ to be its restriction to $S$, and then $H_0 b = 0$, i.e. $b \in N$. □

The degree-three tensoring of $N$, which we denote by $C$, can then be similarly embedded in a puncturing of $RM(m, m - 4)$, where again, only the coordinates in $S$ remain, and similarly $C$ can be seen to be the restriction to $S$ of the subcode $C'$ of $RM(m, m - 4)$ that contains the words that are 0 outside $S$.

Finally, we define the error locating code $E$ to be the restriction of $RM(m, 2)$ to the coordinates of $S$.

We now show that the conditions of Theorem 19 are satisfied in this case. We begin with item 2. If $U$ is a correctable pattern in $N$, it means that the columns indexed by $U$ in $H_0$ are linearly independent. It follows that they are also linearly independent as columns in $E(m, 1)$. Hence, using the same arguments as before we can find, for any coordinate $v \notin U$, a degree 2 polynomial $g$ such that $g(v) = 1$ and $g$ restricted to $U$ is 0. Restricting the evaluations of $g$ to the subset of coordinates $S$, we get a codeword $e \in E$ with the required property.

As for item 1: We first argue that $RM(m, 2) * C' \subseteq N'$, since the degrees match and the property of vanishing outside $S$ is preserved under multiplication. Projecting back to the coordinates in $S$, we get that $E * C \subseteq N$.